Classifying Unseen Cases with Many Missing Values
Abstract
Handling missing attribute values is an important issue in classifier learning, because missing values in either the training data or the test (unseen) data affect the prediction accuracy of learned classifiers. In many real KDD applications, attributes with missing values are very common. This paper studies the robustness of four recently developed committee learning techniques (Boosting, Bagging, Sasc, and SascMB) relative to C4.5 in tolerating missing values in test data. In terms of average error over a representative collection of natural domains, Boosting is found to be about as robust as C4.5. Bagging performs slightly better than Boosting, while Sasc and SascMB perform better than both, with SascMB performing best.
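As a rough illustration of what "tolerating missing values in test data" means operationally, the sketch below masks attribute values in a test set at a chosen rate and measures the resulting error. It is a minimal sketch under stated assumptions: the abstract does not describe how missing values were introduced, so `inject_missing`, `error_rate`, `toy_classifier`, and the default-class fallback are all hypothetical and are not the procedure used by C4.5, Boosting, Bagging, Sasc, or SascMB.

```python
import random


def inject_missing(cases, rate, seed=0):
    """Return copies of `cases` in which each attribute value is replaced
    by None with probability `rate` (assumed masking scheme)."""
    rng = random.Random(seed)
    return [
        {attr: (None if rng.random() < rate else value)
         for attr, value in case.items()}
        for case in cases
    ]


def error_rate(classify, cases, labels):
    """Fraction of cases whose predicted label differs from the true label."""
    wrong = sum(classify(case) != label for case, label in zip(cases, labels))
    return wrong / len(cases)


def toy_classifier(case):
    """Hypothetical stand-in for a learned classifier: thresholds attribute
    "a" and falls back to a default class when "a" is missing."""
    if case["a"] is None:
        return "neg"
    return "pos" if case["a"] > 0.5 else "neg"


# Tiny hand-made test set, for illustration only.
test_cases = [
    {"a": 0.9, "b": 1},
    {"a": 0.1, "b": 0},
    {"a": 0.7, "b": 1},
    {"a": 0.2, "b": 0},
]
test_labels = ["pos", "neg", "pos", "neg"]

# Error as a function of the fraction of missing test values.
for rate in (0.0, 0.5, 1.0):
    corrupted = inject_missing(test_cases, rate)
    print(rate, error_rate(toy_classifier, corrupted, test_labels))
```

Running the same loop for several classifiers on the same masked test sets would give error curves that can be compared, which is the kind of robustness comparison the abstract reports.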
Similar resources
A Critique of the View Claiming Conflict in the Verses of the Knowledge of the Unseen
The claim of conflict among the verses of the Quran on knowledge of the unseen is one of those made by Brasher, the Jewish orientalist. He believes that the verses which treat knowledge of the unseen as specific to God alone conflict with those verses that apparently refer to the awareness of the unseen held by the Prophet (p.b.u.h) and some divinely selected people. Classifying the ver...
Classifying Unseen Cases with Many Missing Values
Handling missing attribute values is an important issue in classifier learning, because missing values in either the training data or the test (unseen) data affect the prediction accuracy of learned classifiers. In many real KDD applications, attributes with missing values are very common. This paper studies the robustness of four recently developed committee learning techniques, including Boost...
Generating Production Rules from Decision Trees
Many inductive knowledge acquisition algorithms generate classifiers in the form of decision trees. This paper describes a technique for transforming such trees to small sets of production rules, a common formalism for expressing knowledge in expert systems. The method makes use of the training set of cases from which the decision tree was generated, first to generalize and assess the reliabili...
Extracting and Learning an Unknown Grammar with Recurrent Neural Networks
Simple second-order recurrent networks are shown to readily learn small known regular grammars when trained with positive and negative string examples. We show that similar methods are appropriate for learning unknown grammars from examples of their strings. The training algorithm is an incremental real-time recurrent learning (RTRL) method that computes the complete gradient and updates the ...
Classifying Unseen Instances by Learning Class-Independent Similarity Functions
Zero-shot recognition (ZSR) deals with the problem of predicting class labels for target domain instances based on source domain side information (e.g. attributes) of unseen classes. We formulate ZSR as a binary prediction problem. Our resulting classifier is class-independent. It takes an arbitrary pair of source and target domain instances as input and predicts whether or not they come from t...